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ABSTRACT 

The  main  emphasis  of  this  tutorial  paper  is  on  the  formulation  of 
appropriate  state-space  models  for  Kalman  filtering 
applications.  The  so-called  "model"  is  completely  specified  by 
four  matrix  parameters  and  the  initial  conditions  of  the 
recursive  equations.  Once  these  are  determined,  the  die  is  cast, 
and  the  way  in  which  the  measurements  are  weighted  is  determined 
foreveraf ter.  Thus,  finding  a  model  that  fits  the  physical 
situation  at  hand  is  all  important.  Also,  it  Is  often  the  most 
difficult  aspect  of  designing  a  Kalman  filter.  Formulation  of 
discrete  state  models  from  the  spectral  density  and  ARMA  random 
process  descriptions  is  discussed.  Finally,  it  is  pointed  out 
that  many  common  processes  encountered  in  applied  work  (such  as 
band-limited  white  noise)  simply  do  not  lend  themselves  very  well 
to  Kalman  filter  modeling. 


INTRODUCTION 


Kalman  filtering  is  now  well  known,  and  tutorial  discussions  of  the  tech¬ 
nique  are  given  in  a  number  of  standard  references  [1,2,3].  The  filter 
recursive  equations  are  summarized  in  Figure  1  for  reference  purposes  here. 
It  should  be  noted  that  once  the  initial  conditions  and  the  <|>^,  H^,  Q^, 
parameters  are  specified,  the  die  is  cast  and  the  way  in  which  the 
measurement  sequence  is  processed  is  completely  determined.  Thus,  the 
specification  of  these  parameters  is  especially  important  —  they  are,  in 
effect,  the  filter  "model".  The  emphasis  in  this  tutorial  paper  will  be  on 
the  modeling  aspect  of  Kalman  filtering.  To  see  where  these  parameters  come 
from,  we  will  now  review  the  basic  process  and  measurement  equations. 
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Figure  1  Kalman  filter  loop 
THE  DISCRETE  PROCESS  AND  MEASUREMENT  EQUATIONS 

The  starting  point  for  discrete  Kalman  filter  theory  begins  with  the  process 
and  measurement  equations.  The  random  process  under  consideration  is 
assumed  to  satisfy  the  following  recursive  equation 

”  Vk +  ”k  (l) 

where  k  refers  to  the  k-th  step  in  time,  x^  is  a  vector  random  process, 
<j>^  is  the  transition  matrix,  and  w^  is  a  Gaussian  white  sequence  with  a 
covariance  structure  given  by 

E[xkXk]  =  Qk  (2) 

The  measurement  relationship  is  assumed  to  be  of  the  form 

‘  Vk +  Tk  <3) 

where  v^  is  also  a  Gaussian  white  sequence,  uncorrelated  with  w^,  and 
described  by  the  covariance 


ElVk>  ■  \  (4) 

In  words,  then,  the  key  parameters  of  a  Kalman  filter  model  can  be  described 
as  follows: 
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(1)  <j)  is  Che  transition  matrix  that  describes  the  natural  dynamics  of 
the  process  in  going  from  step  k  to  k+i. 

(2)  is  the  linear  connection  matrix  that  gives  the  ideal 
(noiseless)  relationship  between  the  measurement  z^  and  the 
process  to  be  estimated  x^. 

(3)  describes  the  additional  noise  that  comes  into  the  x^  process 
in  the  At  interval  between  step  k  and  k+1. 

(4)  describes  additive  measurement  noise. 

It  is  important  to  note  that  the  discrete  model  described  by  Eqs.  (1) 
through  (4)  stands  in  its  own  right.  It  is  not  an  approximation  of  some 
continuous  system,  nor  does  it  have  to  be  related  to  another  conti  nuous 
linear  dynamical  system  in  any  way.  Once  the  discrete  model  is  assumed,  the 
recursive  estimation  process  given  in  Fig.  1  follows  directly. 

IMPORTANCE  OF  THE  GAUSSIAN  ASSUMPTION 

We  will  digress  for  a  moment  and  look  at  the  Gaussian  assumption  used  in 
Eqs.  (1)  through  (4).  If  w,  and  are  Gaussian  white  sequences,  then 
and  will  be  Gaussian  processes.  Even  though  the  Gaussian  assumption  js 
often  omitted  in  discussions  of  least-squares  filtering,  we  make  here  with 
no  apology.  The  reason  for  this  is  that  minimizing  the  mean  square  error 

really  does  not  make  very  good  sense  for  non-Gaussian  processes.  To 

illustrate  this,  consider  the  two  processes  shown  in  Fig.  2.  The  first  is  a 
scalar  Gauss-Markov  process  which  has  the  general  appearance  of  typical 
noise.  The  second  process  is  the  random  telegraph  wave  which  switches 
between  +1  and  -1  at  random  points  in  time.  If  the  parameters  of  the  two 
processes  are  adjusted  appropriately,  they  can  be  made  to  have  identical 
power  spectral  density  functions.  Yet,  they  are  radically  different 
processes!  The  least-squares  prediction  far  out  into  the  future  is  zero  for 
both  cases.  This  makes  good  sense  in  the  Gauss-Markov  case  because  zero  is 
the  mean  and  most  likely  value.  On  the  other  hand,  it  is  ridiculous  to 

predict  zero  in  the  random  telegraph  wave  case.  We  know  a  priori  that  this 
waveform  is  never  zero.  We  would  be  better  off  to  predict  either  +1  or  -1 

and  be  correct  half  the  time  than  to  predict  zero  and  be  wrong  all  the  time! 

Thus,  the  Gaussian  assumption  is  a  reasonable  one  in  the  least  squares 
estimation  theory,  and  to  stray  from  it  leads  us  into  dangerous  territory. 
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Figure  2  Gauss-Markov  and  random  telegraph  waves 


TRANSITION  FROM  A  SPECTRAL  DESCRIPTION  TO  A  DISCRETE  STATE  MODEL 


In  Kalman  filter  applications,  we  frequently  begin  with  a  spectral  descrip¬ 
tion  of  the  various  random  processes  involved.  The  problem  then  is  to 
convert  this  information  to  a  model  of  the  form  specified  by  Eqs,  (1) 
through  (4).  The  general  procedure  for  making  the  transition  to  the 
discrete  model  is  as  follows: 

(1)  Look  for  a  continuous  dynamical  system  that  yields  the  desired 
process  when  driven  by  white  noise.  (The  white  noise  input 
assures  that  w^  will  be  a  white  sequence.  ) 

(2)  Then  write  the  dynamical  equations  in  state-space  form: 

k  =  Ax  +  Bu  (5) 

(3)  Solve  the  state  equations  for  step  size  At  and  obtain 


■  Vk +  "k 


(4)  Determine  the  measurement  equation  from  the  particular  situation 
at  hand. 


To  illustrate  the  procedure  further,  suppose  the  y  process  power  spectral 

r) 

density  function  S  (s)  can  be  written  as  a  ratio  of  polynomials  in  s'"  (or 
2  2  2  y 

m  ,  where  m  =*  -s  ).  The  spectral  function  can  then  always  be  factored  into 
two  symmetric  parts,  one  with  its  poles  and  zeros  in  the  left-half  s  plane, 
the  other  with  mirror-image  poles  and  zeros  in  the  right-half  plane.  This 
is  called  spectral  factorization  and  is  represented  mathematically  as 
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(7) 


S  (s)  =  S+(s)*S  (s) 

y  y  y 

where  S+  and  S  are  the  left-  and  right-half  plane  parts  respectively. 

,  y  y 

S  (s)  then  becomes  the  shaping  filter  that  will  shape  unity  white  noise  into 
a  process  y(t)  with  a  spectral  function  S  (s).  (See  Ref.  [1]  for  further 
details. ) 

Now  suppose  that  the  shaping  filter  is  of  the  form  shown  in  Fig.  3.  We  seek 
a  state-space  model  for  that  dynamical  system.  One  way  of  achieving  this  is 


Figure  3  Shaping  filter 


shown  in  block  diagram  form  in  Fig.  4.  The  state-space  equations  are  then 


Figure  4  Shaping  filter  redrawn 


265 


Control  system  engineers  refer  to  this  as  the  controllable  canonical  form, 
and  it  can  always  be  achieved  for  the  dynamical  system  as  shown  in  Fig.  3. 
If  y  is  the  process  that  is  actually  measured,  then  the  H  matrix  is  just  the 
row  matrix  of  b's  given  in  Eq.  (9). 

EXAMPLE 

Suppose  we  have  a  scalar  Gauss-Markov  process  y(t)  whose  power  spectral 
density  function  is 

V>  ■  <«  t4>  <‘0> 

-s  +  0  <a  +  g 

We  first  factor  Sy  as  follows: 

„  ,  x  7 2 o2e  J 2a 2g 

S  (s)  =  -~~r  •  - ~r-  (11) 

y  s  +  3  -s  +  3 

The  shaping  filter  is  then  (s+3 )  which  corresponds  to  the  dynamical 


equation 


y  +  3y=v2o  3  w(t) 


This  is  a  simple  first  order  differential  equation,  so  we  only  have  one 
state  variable.  Call  it  x^.  Our  state  equation  is  then 


=  -3x^  +  \J  2a  3  w(t) 


The  solution  of  this  equation  for  a  step  size  At  is 

-  _eAt  i 

Vi  e  \  +  \  (14) 

and  can  be  seen  Co  be  Che  cransltlon  macrtx  $  .  The  mean  square  value 

of  w^  can  be  determined  from  random  process  theory  [1],  and  it  works  out  to 
be 
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(15) 


Qk  -  E[wk] 


=  a  2  ( 1  - 


-23AtN 
•e  ) 


The  process  model  is  now  complete. 

UNIQUENESS 

We  might  pose  a  question  at  this  point: 

Are  Kalman  filter  models  unique? 

The  answer  is  an  emphatic  NO.  We  know  from  linear  system  theory  that  any 
nonsingular  linear  transformation  on  the  state  vector  leads  to  another 
equally  legitimate  state  vector.  The  choice  of  coordinate  frame  for 
performing  the  estimation  process  is  purely  a  matter  of  convenience. 
Optimal  estimates  can  be  transformed  freely  from  one  coordinate  frame  to 
another  (through  a  linear  transformation)  and  still  remain  optimal  estimates 
in  the  new  frame  of  reference. 


ARMA  MODEL 

Sometimes  the  random  process  model  comes  to  us  in  the  form  of  a  difference 
equation  rather  than  a  continuous  differential  equation.  For  example, 
consider  the  auto-regressive  moving  average  (ARMA)  model  that  relates  a 
discrete  process  y(k)  to  an  input  white  sequence  u(k). 

y(k+n)  +  a^y(k+n~l)  +  **•  a^y(k)  =  0^u(k+n~l)  +  •••  S^u(k)  (L6) 


There  is  a  close  analogy  between  difference  and  differential  equations,  and 
it  works  out  that  this  nth-order  difference  equation  can  be  converted  to 
vector  form  in  much  the  same  manner  as  for  a  differential  equation.  If  we 
define  an  intermediate  variable  y' (k)  as  the  solution  to  Eq.  (16)  with  just 
u(k)  as  the  driving  function,  and  then  define  our  state  variables  as 

x1(k)  =  y^(k),  x2(lt)  =  y'"’  ( k+ 1 ) ,  etc.  (17) 
then  the  system  of  Eq.  (16)  translates  into  state-space  form  as 
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(18) 


y(k)  =  [0n  3n_1 


x^k) 

x2(k) 


xn(k) 


(19) 


Note  that  our  choice  of  state  variables  leads  to  the  controllable  canonical 
form,  just  as  in  the  continuous  dynamical  case.  Of  course,  we  could  have 
defined  our  state  variables  differently  and  arrived  at  a  form  different  from 
Eqs.  (18)  and  (19).  We  will  not  pursue  this  further  other  than  to  say  the 
choice  of  state  variables  is  (within  limits)  a  matter  of  convenience  for  the 
situation  at  hand. 


PROCESSES  DERIVED  FROM  IRRATIONAL  SHAPING  FILTERS 

The  random  process  modeling  procedures  discussed  thus  far  have  been 
straightforward.  They  may  be  tedious  for  higher-order  processes,  but  they 
do  not  call  for  much  imagination.  There  exists,  however,  a  whole  class  of 
processes  where  this  is  not  the  case.  These  are  the  processes  that  cannot 
be  thought  of  as  the  result  of  passing  vector  white  noise  through  a  linear 
dynamical  system  of  finite  order.  Such  processes  are  commonplace  in 
engineering  literature.  For  example,  bandlimited  Gaussian  white  noise  is  a 
very  useful  abstraction  in  communication  theory.  It  is  Gaussian  noise  that 
has  a  flat  spectrum  in  the  baseband  and  then  is  zero  out  beyond  the  cutoff 
frequency.  It  can  be  thought  of  as  the  result  of  passing  pure  white  noise 
through  an  idealized  lowpass  filter,  but  no  such  filter  can  be  represented 
as  a  ratio  of  polynomials  In  s  of  finite  order.  (Note  that  a  Butterworth 
filter  can  be  made  to  approximate  the  ideal  case,  but  not  equal  it.)  The 
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idealizations  of  bandlimited  white  noise  are  often  a  convenience  in  communi¬ 
cation  theory;  however,  they  are  an  obstruction  in  Kalman  filter  theory. 


There  is  a  theorem  from  linear  systems  theory  that  is  useful  at  this  point. 
Chen  [4]  gives  us  the  following  criterion  for  the  realization  of  linear 
dynamical  models. 

A  linear  dynamical  model  of  the  form 


x  =  Ax  +  Bu 
y  =  Cx  +  Du 


(20) 


will  exist  for  a  system  with  an  input-output  impulsive  response  G(t,x), 
if  and  only  if,  G(t,r)  is  factorable  in  the  form 

G(t,r)  «  X(t)N(x)  (21) 

M  and  N  are  finite-order  matrices,  so  if  G(t,x)  is  scalar  (i.e.,  single¬ 
input,  single-output),  M(t)  is  a  row  vector  and  N(t)  is  a  column  vector. 
This  theorem  can  then  be  used  as  a  test  to  see  if  a  dynamical  system  will 
exist  for  a  corresponding  impulsive  response  function.  Furthermore,  the 
factorization  provides  the  necessary  information  for  realization  of  the 
model.  (See  Chen  [4]  for  further  details.)  We  will  use  flicker  noise  to 
illustrate  the  use  of  Chen's  theorem.  Flicker  noise  is  of  special  interest 
to  the  PTTI  community  because  of  its  presence  in  precision  frequency 
standards.  T.t  is  characterized  by  a  power  spectral  density  function  of  the 
form  of  1/f  at  the  frequency  level,  or  1 / f "  when  referred  to  the  phase  level 
[5,6],  A  block  diagram  showing  the  relationship  between  flicker  noise  and 
white  noise  is  given  in  Fig.  5. 


White 

Noise 

w(t) 


1 

. . \ 

1 

Sl/2 

Frequency 

s 

phase 


( t ime ) 


Figure  5  Block  diagrams  relating  flicker  noise  to  white  noise 

Clearly,  the  transfer  function  relating  input  white  noise  to  the  output 
3/2  3/2 

phase  x(t)  is  1/s  '  .  The  inverse  transform  of  1/s  '  gives  the  impulsive 
response  to  an  impulse  applied  at.  t=0.  This  is  2/t//x.  Thus,  for  an 
impulse  applied  at  t=x ,  we  have  (in  Chen's  notation) 


G(t  ,X  )  =  -  / t~X  ,  t  >  T 

/tT 


(22) 


The  question  is,  "Is  G(t,x)  factorable  in  the  form  M(t)N(x)?"  It  appears 
that  it  is  not,  although  this  is  difficult  to  show  in  a  rigorous  sense. 
This  being  the  case,  Chen's  theorem  says  that  no  linear  dynamical  system 
will  exist  that  corresponds  to  the  G(t,x)  of  Eq.  (22).  This  is  to  say  that 
no  finite-order  state  model  will  exactly  represent  flicker  noise!  Of 
course,  the  state  model  is  essential  for  Kalman  filtering,  so  this  leads  to 
a  dilema  when  one  attempts  to  include  flicker  noise  in  a  Kalman  filter  clock 
model.  This  is  the  subject  of  a  companion  paper  in  these  Proceedings  [6], 
so  we  will  not  pursue  this  further  here. 

SUMMARY 

Various  aspects  of  Kalman  filtering  modeling  have  been  discussed  briefly  in 
this  paper.  Perhaps  the  most  important  thing  to  remember  is  that  the  random 
processes  under  consideration  must  be  modeled  in  vector  state-space  form. 
This  can  often  be  done  with  exact  methods.  If  the  exact  methods  discussed 
here  cannot  be  used,  as  in  the  case  of  flicker  noise,  then  one  must  seek 
approximate  finite-order  vector  models  in  order  to  form  a  workable  Kalman 
filter.  The  measurement  model  usually  does  not  cause  difficulty,  because  it 
simply  depends  on  what  state  variables  are  being  observed. 
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QUESTIONS  AND  ANSWERS 


VICTOR  REINHARDT,  HUGHES  AIRCRAFT  COMPANY:  I  think  you  are  right 
about  that  not  being  able  to  be  factored,  and  I  think  that  I  have 
a  reason  for  that.  You  can  show  that  flicker  noise  can  be 
mathematically  generated  by  the  sum  of  an  infinite  number  of 
gaussian  processes  where  the  beta  term  goes  from  zero  to 
infinity.  Therefore,  there  are  infinite  time  constants  in  the 
process.  So,  you  can't  give  a  state  vector  at  any  one  time, 
because  the  beta  term  goes  from  zero  to  infinity. 

MR.  BROWN:  I  agree  with  what  you  say.  I  think  that  it  fits  my 
intuition  to  think  the  same  thing,  and  I  have  read  that  paper 
that  you  wrote  on  it.  I  think  that  it's  a  very  nice  paper,  and  a 
nice  way  to  look  at  it. 

Other  people  have  also  approximated  flicker  noise  with  a 
cascaded  sequence  of  what  we,  in  control  system  engineering,  call 
lead  or  lag  networks,  which  gives  kind  of  a  staircase  sort  of 
frequency  response  function,  which,  to  a  certain  degree  of 
approximation,  drops  off  at  ten  dB  per  decade  rather  than  twenty 
dB. 

If  you  take  any  rational  transfer  function,  or  one  that  is 
written  out  in  integer  powers,  and  look  at  the  Bode  plot,  the 
slopes  go  in  multiples  of  twenty  dB  per  decade,  There  are  no 
thirty  dB  per  decade,  or  fifty  dB  per  decade  slopes. 

In  the  case  of  flicker  noise,  and  consider  the  filter  that 
shapes  white  noise  into  flicker  noise,  it  requires  an  s  to  the 
negative  one-half1  power  in  the  transfer  function.  That  would  give 
a  Bode  plot  that  drops  off  at  ten  dB  per  decade  instead  of 
twenty.  What  you  would  do  is  approximate  that  ten  dB  per  decade 
slope  with  a  whole  sequence  of  filters  with  alternating  zeros  and 
poles.  You  then  end  up  with  a  staircase  shape  response  which,  on 
the  average,  has  a  ten  dB  per  decade  slope. 

Incidentally,  I  think  that  this  is  a  very  good  way  to  model 
flicker  noise.  The  difficulty  is  that  every  time  you  put  a  new 
pole  in  the  system  you  have  a  new  state  model.  If  you  want  get  a 
reasonably  accurate  approximation  of  flicker  noise  that  way,  it 
does  involve  escalating  the  order  of  the  Kalman  filter 
considerably.  There  is  nothing  wrong  with  doing  it  off-line  for 
analysis  purposes.  I  think  that  there  are  some  on-line  cases 
where  it  would  not  be  accepted. 

MR.  REINHARDT:  I  think  that  some  people  have  reported  on  a 
similar  method  where  they  used  a  finite  number  of  filters  and  it 
worked  very  well  in  an  operational  case.  If  you  try  to  limit  that 
process  though,  what  happens  is  that  all  the  poles  run  together, 
and  you  end  up  with  a  branch  line. 

MR.  BROWN:  I  guess  my  answer  to  that  would  be  that,  in  any  of 
these  processes,  in  the  case  of  flicker  noise  for  example,  at 
zero  frequency  and  out  at  infinity,  there  are  singular  conditions 
for  either  case.  If  it  drops  off  as  one  over  f,  the  area  under 
the  curve  out  at  infinity  is  not  finite.  You  are  talking  about  a 
process  with  infinite  variance,  which  is  physically  ridiculous. 

The  same  thing  happens  at  the  other  end  of  the  spectrum,  the 
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area  under  the  curve  doesn't  converge  there,  either.  Physically 
it  makes  sense,  if  you  want  to  be  careful  and  talk  about 
processes  of  finite  variance,  that  you  have  to  bound  the  power 
spectral  density  at  the  low  frequency  end  and  at  the  high 
frequency  end.  It  has  to  roll  off  at  least  twenty  dB  per  decade 
in  order  to  have  a  process  of  finite  variance. 

It  doesn't  bother  me  to  think  of  putting  in  a  filter  at  the 
origin  which  will  bound  the  frequency  content  at  zero  frequency, 
and  also  put  one  in  at  the  high  end  and  make  it  roll  off  at  least 
twenty  dB  per  decade. 

Incidentally,  that  impulse  response  function  is  not  original 
with  me.  Other  people  have  written  about  that  before,  including 
yourself,  I  think. 

JIM  BARNES,  AUSTRON,  INC.:  I  have  done  a  fair  amount  of 
simulation  of  flicker  noise  with  polynomials,  the  lead-lag 
networks  you  mentioned,  and  have  one  comment  in  their  defense: 
Three  or  four  stages  can  do  an  amazing  amount.  You  can  cover  as 
much  as  three  to  four  decades  of  frequency  with  only  three  or 
four  stages. 

MR.  BROWN:  Oh,  is  that  right?  It  isn't  as  bad  as  it  might  appear 
at  first  glance  then.  I  haven't  used  it,  but  would  have  imagined 
that  you  would  need  a  fairly  large  number. 

MR.  REINHARDT:  As  another  comment,  even  a  single  filter,  which 
generates  a  random  telegraph,  will  generate  a  flat  Allan  variance 
of  about  two  orders  of  magnitude  in  tau,  right  around  the  peak. 
Then  you  really  have  to  put  a  pole  every  order  of  magnitude  or 
even  every  two  orders  of  magnitude. 

MR.  BROWN:  All  of  these  are,  of  course,  approximate  models  for 
the  reasons  which  I  just  cited. 

MR.  ALLAN:  I  think,  in  practice,  the  problem  with  flicker  noise 
is  not  a  serious  one,  because  it's  only  at  the  extremes,  as  you 
pointed  out,  at  zero  and  at  infinity  that  you  have  difficulties 
with  one  over  f  integration.  In  practice,  that's  not  where  the 
Fourier  frequencies  are.  In  reality,  a  few  stages  of  the  filter 
will  work  very  nicely  in  describing,  predicting  or  simulating  a 
flicker  process. 

MR.  BROWN:  You  need  something  like  that  though  as  far  as  the 
Kalman  filter  is  concerned.  You  can't  afford  to  have  these 
fractional  powers  of  s  is  you  are  going  to  do  the  state  model. 
You  have  to  have  something  where  you  only  need  to  worry  about 
integer  powers  of  s,  and  if  you  can  do  that  by  only  adding  two  or 
three  poles,  that  would  be  a  very  feasible  way  to  approximate  it 
certainly . 
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